The NIST Meeting Room Pilot Corpus
Authors
Abstract
One of the next big challenges in Automatic Speech Recognition (ASR) is the transcription of speech in meetings. This task is particularly problematic for current recognition technologies because, in most realistic meeting scenarios, the vocabularies are unconstrained, the speech is spontaneous and often overlapping, and the microphones are inconspicuously placed. To support the development of meeting recognition technologies by both the speech recognition and video extraction research communities, NIST is providing a development and evaluation infrastructure that includes a multimedia corpus of audio and video from meetings collected at NIST using a variety of microphones and video cameras, along with new evaluation protocols, metrics, software, and rich transcription conventions. NIST is also sponsoring evaluations and workshops, facilitating multi-site data pooling, and helping bring the community together to focus on the technical challenges. To date, NIST has collected a pilot corpus of 15 hours of meetings in its specially instrumented Meeting Data Collection Laboratory. The corpus includes digital recordings from close-talking mics, lapel mics, and distantly placed mics, 5 digitally recorded camera views, and full speaker/word-level transcripts. This data is being used in the development and evaluation of speech technologies and by the video extraction community under the auspices of the ARDA Video Analysis and Content Exploitation (VACE) program.
Related resources
The NIST Meeting Room Corpus 2 Phase 1
The National Institute of Standards and Technology’s Information Access Division has collected a second phase of meetings in the NIST Meeting Data Collection Laboratory. The meeting laboratory, which was used to collect a 15 h pilot corpus beginning in 2001, was updated with 7 High Definition Video (HDV) cameras and new head microphones for participants to collect a twenty hour corpus to support...
Shared Linguistic Resources for the Meeting Domain
This paper describes efforts by the University of Pennsylvania's Linguistic Data Consortium to create and distribute shared linguistic resources – including data, annotations, tools and infrastructure – to support the Spring 2007 (RT-07) Rich Transcription Meeting Recognition Evaluation. In addition to making available large volumes of training data to research participants, LDC produced refere...
Linguistic Resources for the Meeting Domain
This paper describes efforts by the University of Pennsylvania's Linguistic Data Consortium to create and distribute shared linguistic resources – including data, annotations, and tools – to support the Spring 2009 (RT-09) Rich Transcription Meeting Recognition Evaluation. In addition to making available large volumes of training data to research participants, LDC produced reference transcripts...
The segmentation of multi-channel meeting recordings for automatic speech recognition
One major research challenge in the domain of the analysis of meeting room data is the automatic transcription of what is spoken during meetings, a task which has gained considerable attention within the ASR research community through the NIST rich transcription evaluations conducted over the last three years. One of the major difficulties in carrying out automatic speech recognition (ASR) on t...
Panasonic Real-time Meeting Room STT
In this paper, we describe a real-time speech-to-text (STT) system for Meeting Room (MR) recognition developed at Panasonic. The system is an evolution of Panasonic’s Broadcast News (BN) STT system that was evaluated at the NIST Rich Transcription (RT) 03S event. Newest features of interest include syllable models and merged Multiple Heteroscedastic Linear Discriminant Analysis (MHLDA) feature ...
Publication date: 2004